touch gesture


Touch Speaks, Sound Feels: A Multimodal Approach to Affective and Social Touch from Robots to Humans

Ren, Qiaoqiao, Belpaeme, Tony

arXiv.org Artificial Intelligence

Affective tactile interaction constitutes a fundamental component of human communication. In natural human-human encounters, touch is seldom experienced in isolation; rather, it is inherently multisensory. Individuals not only perceive the physical sensation of touch but also register the accompanying auditory cues generated through contact. The integration of haptic and auditory information forms a rich and nuanced channel for emotional expression. While extensive research has examined how robots convey emotions through facial expressions and speech, their capacity to communicate social gestures and emotions via touch remains largely underexplored. To address this gap, we developed a multimodal interaction system incorporating a 5x5 grid of 25 vibration motors synchronized with audio playback, enabling robots to deliver combined haptic-audio stimuli. In an experiment involving 32 Chinese participants, ten emotions and six social gestures were presented through vibration, sound, or their combination. Participants rated each stimulus on arousal and valence scales. The results revealed that (1) the combined haptic-audio modality significantly enhanced decoding accuracy compared to single modalities; (2) each individual channel (vibration or sound) effectively supported the recognition of certain emotions, with distinct advantages depending on the emotional expression; and (3) gestures alone were generally insufficient for conveying clearly distinguishable emotions. These findings underscore the importance of multisensory integration in affective human-robot interaction and highlight the complementary roles of haptic and auditory cues in enhancing emotional communication.
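
As a rough illustration of how such combined stimuli could be delivered, the sketch below (Python, not taken from the paper) drives a hypothetical 5x5 motor grid frame by frame while an audio clip plays in parallel; the drive_motor() call and the 10 Hz frame rate are assumptions for illustration only.

```python
# Minimal sketch (not the authors' code): delivering a combined haptic-audio
# stimulus. The vibration pattern is a (T, 5, 5) array of motor intensities in
# [0, 1] sampled at 10 Hz; drive_motor() is a hypothetical stand-in for
# whatever PWM/serial interface the real sleeve uses.
import time
import numpy as np

FRAME_RATE_HZ = 10          # one 5x5 intensity frame every 100 ms
STIMULUS_SECONDS = 10

def drive_motor(row: int, col: int, intensity: float) -> None:
    """Hypothetical hardware call: set one motor's duty cycle (0.0-1.0)."""
    pass  # replace with the real driver (PWM board, serial packet, ...)

def play_stimulus(pattern: np.ndarray, start_audio=None) -> None:
    """Play a (T, 5, 5) vibration pattern, optionally alongside audio."""
    if start_audio is not None:
        start_audio()                      # e.g. launch sounddevice.play(...)
    t0 = time.perf_counter()
    for t, frame in enumerate(pattern):
        for r in range(5):
            for c in range(5):
                drive_motor(r, c, float(frame[r, c]))
        # sleep until the next frame boundary to keep haptics and audio in sync
        next_tick = (t + 1) / FRAME_RATE_HZ
        time.sleep(max(0.0, next_tick - (time.perf_counter() - t0)))

# Example: a pulsing pattern whose intensity rises and falls over 10 seconds.
T = FRAME_RATE_HZ * STIMULUS_SECONDS
envelope = 0.5 * (1 + np.sin(np.linspace(0, 4 * np.pi, T)))
pattern = np.repeat(envelope[:, None, None], 25, axis=1).reshape(T, 5, 5)
play_stimulus(pattern)
```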


Exploring Mobile Touch Interaction with Large Language Models

Zindulka, Tim, Sekowski, Jannek, Lehmann, Florian, Buschek, Daniel

arXiv.org Artificial Intelligence

Interacting with Large Language Models (LLMs) for text editing on mobile devices currently requires users to break out of their writing environment and switch to a conversational AI interface. In this paper, we propose to control the LLM via touch gestures performed directly on the text. We first chart a design space that covers fundamental touch input and text transformations. In this space, we then concretely explore two control mappings: spread-to-generate and pinch-to-shorten, with visual feedback loops. We evaluate this concept in a user study (N=14) that compares three feedback designs: no visualisation, text length indicator, and length + word indicator. The results demonstrate that touch-based control of LLMs is both feasible and user-friendly, with the length + word indicator proving most effective for managing text generation. This work lays the foundation for further research into gesture-based interaction with LLMs on touch devices.
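
To make the control mapping concrete, here is a minimal sketch of how a pinch or spread gesture on selected text could be turned into an LLM instruction; the finger-distance-to-word-count mapping and the call_llm() stub are assumptions, not the paper's implementation.

```python
# Minimal sketch (assumptions, not the paper's system): mapping a pinch or
# spread touch gesture on a text selection to an LLM edit request.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")  # hypothetical

def gesture_to_instruction(selected_text: str, scale: float) -> str:
    """scale < 1.0 means pinch (shorten); scale > 1.0 means spread (generate)."""
    current_words = len(selected_text.split())
    target_words = max(1, round(current_words * scale))
    if scale < 1.0:
        action = f"Shorten the following text to about {target_words} words"
    else:
        action = f"Expand the following text to about {target_words} words"
    return f"{action}, keeping the meaning and tone:\n\n{selected_text}"

def on_pinch_or_spread(selected_text: str, start_dist: float, end_dist: float) -> str:
    scale = end_dist / max(start_dist, 1e-6)   # ratio of finger distances
    return call_llm(gesture_to_instruction(selected_text, scale))
```

The target word count computed here is also what a length indicator in the visual feedback loop could display while the gesture is in progress.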


Touched by ChatGPT: Using an LLM to Drive Affective Tactile Interaction

Ren, Qiaoqiao, Belpaeme, Tony

arXiv.org Artificial Intelligence

Touch is a fundamental aspect of emotion-rich communication, playing a vital role in human interaction and offering significant potential in human-robot interaction. Previous research has demonstrated that a sparse representation of human touch can effectively convey social tactile signals. However, advances in human-robot tactile interaction remain limited, as many humanoid robots possess simplistic capabilities, such as only opening and closing their hands, restricting nuanced tactile expressions. In this study, we explore how a robot can use sparse representations of tactile vibrations to convey emotions to a person. To achieve this, we developed a wearable sleeve integrated with a 5x5 grid of vibration motors, enabling the robot to communicate diverse tactile emotions and gestures. Using chain prompts within a Large Language Model (LLM), we generated distinct 10-second vibration patterns corresponding to 10 emotions (e.g., happiness, sadness, fear) and 6 touch gestures (e.g., pat, rub, tap). Participants (N = 32) then rated each vibration stimulus based on perceived valence and arousal. Participants were accurate at recognising the intended emotions, a result that aligns with earlier findings. These results highlight the LLM's ability to generate emotional haptic data and effectively convey emotions through tactile signals. By translating complex emotional and tactile expressions into vibratory patterns, this research demonstrates how LLMs can enhance physical interaction between humans and robots.
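
A minimal sketch of the general idea, not the authors' chain prompts: ask an LLM for a 10-second pattern for a 5x5 grid encoded as JSON and validate it before playback. The prompt wording, the 10 Hz frame rate, and the query_llm() stub are assumptions.

```python
# Minimal sketch (assumptions, not the paper's prompts): LLM-generated
# vibration patterns for a 5x5 motor grid, returned as JSON and validated.
import json
import numpy as np

FRAMES = 100   # assume 10 seconds at 10 frames per second

PROMPT_TEMPLATE = (
    "Design a vibration pattern that conveys the emotion '{emotion}' on a 5x5 "
    "grid of motors worn on the forearm. Return only JSON: a list of {frames} "
    "frames, each a 5x5 list of intensities between 0.0 and 1.0."
)

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")  # hypothetical

def generate_pattern(emotion: str) -> np.ndarray:
    raw = query_llm(PROMPT_TEMPLATE.format(emotion=emotion, frames=FRAMES))
    pattern = np.asarray(json.loads(raw), dtype=float)
    if pattern.shape != (FRAMES, 5, 5):
        raise ValueError(f"expected ({FRAMES}, 5, 5), got {pattern.shape}")
    return np.clip(pattern, 0.0, 1.0)   # keep intensities in the valid range
```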


Sound-Based Recognition of Touch Gestures and Emotions for Enhanced Human-Robot Interaction

Hou, Yuanbo, Ren, Qiaoqiao, Wang, Wenwu, Botteldooren, Dick

arXiv.org Artificial Intelligence

Emotion recognition and touch gesture decoding are crucial for advancing human-robot interaction (HRI), especially in social environments where emotional cues and tactile perception play important roles. However, many humanoid robots, such as Pepper, Nao, and Furhat, lack full-body tactile skin, limiting their ability to engage in touch-based emotional and gesture interactions. In addition, vision-based emotion recognition methods usually face strict GDPR compliance challenges due to the need to collect personal facial data. To address these limitations and avoid privacy issues, this paper studies the potential of using the sounds produced by touching during HRI to recognise tactile gestures and classify emotions along the arousal and valence dimensions. Using a dataset of tactile gestures and emotional interactions from 28 participants with the humanoid robot Pepper, we design a lightweight audio-only touch gesture and emotion recognition model with only 0.24M parameters, 0.94MB model size, and 0.7G FLOPs. Experimental results show that the proposed sound-based touch gesture and emotion recognition model effectively recognises the arousal and valence states of different emotions, as well as various tactile gestures, even as the input audio length varies. The proposed model is low-latency and achieves results similar to well-known pretrained audio neural networks (PANNs), but with much smaller FLOPs, parameters, and model size.
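
For orientation, a toy version of such an audio-only model might look like the sketch below: a small CNN over log-mel spectrograms with one head for gesture class and one for arousal/valence. The architecture is an assumption for illustration, not the paper's 0.24M-parameter model.

```python
# Minimal sketch (assumed architecture, not the paper's model): a small CNN
# over log-mel spectrograms with two heads, one for touch-gesture class and
# one for arousal/valence, illustrating an audio-only, low-parameter design.
import torch
import torch.nn as nn

class TinyTouchSoundNet(nn.Module):
    def __init__(self, n_gestures: int = 6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # pool over time so input length may vary
        )
        self.gesture_head = nn.Linear(64, n_gestures)
        self.affect_head = nn.Linear(64, 2)   # arousal and valence

    def forward(self, log_mel: torch.Tensor):
        # log_mel: (batch, 1, n_mels, n_frames), any number of frames
        z = self.features(log_mel).flatten(1)
        return self.gesture_head(z), self.affect_head(z)

model = TinyTouchSoundNet()
print(sum(p.numel() for p in model.parameters()))   # well under 1M parameters
```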


Clustering Social Touch Gestures for Human-Robot Interaction

Chahine, Ramzi Abou, Vasquez, Steven, Fazli, Pooyan, Seifi, Hasti

arXiv.org Artificial Intelligence

Social touch provides a rich non-verbal communication channel between humans and robots. Prior work has identified a set of touch gestures for human-robot interaction and described them with natural language labels (e.g., stroking, patting). Yet, no data exists on the semantic relationships between the touch gestures in users' minds. To endow robots with touch intelligence, we investigated how people perceive the similarities of social touch labels from the literature. In an online study, 45 participants grouped 36 social touch labels based on their perceived similarities and annotated their groupings with descriptive names. We derived quantitative similarities of the gestures from these groupings and analyzed the similarities using hierarchical clustering. The analysis resulted in 9 clusters of touch gestures formed around the social, emotional, and contact characteristics of the gestures. We discuss the implications of our results for designing and evaluating touch sensing and interactions with social robots.
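
The analysis pipeline the abstract describes can be sketched as follows: count how often participants placed two labels in the same group, turn that into a distance, and cluster hierarchically. The labels and groupings in the sketch are made-up placeholders, not the study's data.

```python
# Minimal sketch (illustrative, not the study's analysis code): deriving
# gesture-label similarities from participants' groupings and clustering them
# hierarchically with SciPy.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

labels = ["stroke", "pat", "tap", "hit", "squeeze"]   # placeholder labels
# Each participant's grouping: a list of sets of label indices (placeholders).
groupings = [
    [{0, 1}, {2, 3}, {4}],
    [{0, 1, 4}, {2, 3}],
]

n = len(labels)
cooc = np.zeros((n, n))
for grouping in groupings:
    for group in grouping:
        for i in group:
            for j in group:
                cooc[i, j] += 1

similarity = cooc / len(groupings)   # fraction of participants grouping i with j
distance = 1.0 - similarity
np.fill_diagonal(distance, 0.0)

Z = linkage(squareform(distance, checks=False), method="average")
print(fcluster(Z, t=2, criterion="maxclust"))   # e.g. cut the tree into 2 clusters
```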


Deep Learning Classification of Touch Gestures Using Distributed Normal and Shear Force

Choi, Hojung, Brouwer, Dane, Lin, Michael A., Yoshida, Kyle T., Rognon, Carine, Stephens-Fripp, Benjamin, Okamura, Allison M., Cutkosky, Mark R.

arXiv.org Artificial Intelligence

When humans socially interact with another agent (e.g., human, pet, or robot) through touch, they do so by applying varying amounts of force with different directions, locations, contact areas, and durations. While previous work on touch gesture recognition has focused on the spatio-temporal distribution of normal forces, we hypothesize that the addition of shear forces will permit more reliable classification. We present a soft, flexible skin with an array of tri-axial tactile sensors for the arm of a person or robot. We use it to collect data on 13 touch gesture classes through user studies and train a Convolutional Neural Network (CNN) to learn spatio-temporal features from the recorded data. The network achieved a recognition accuracy of 74% with normal and shear data, compared to 66% using only normal force data. Adding distributed shear data improved classification accuracy for 11 out of 13 touch gesture classes.
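
A minimal sketch of such a classifier, under assumed array sizes rather than the authors' sensor layout: a small 3D CNN whose input has one channel for normal force and two for the shear components.

```python
# Minimal sketch (assumed layout, not the authors' network): a 3D CNN over
# spatio-temporal tactile frames with three channels per taxel (normal force
# plus two shear components), classifying 13 touch gestures.
import torch
import torch.nn as nn

class ShearGestureCNN(nn.Module):
    def __init__(self, n_classes: int = 13):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 3, time, rows, cols) - normal, shear-x, shear-y per taxel
        return self.classifier(self.net(x).flatten(1))

# e.g. a batch of 4 recordings, 32 time steps on an 8x8 sensor array
logits = ShearGestureCNN()(torch.randn(4, 3, 32, 8, 8))
print(logits.shape)   # torch.Size([4, 13])
```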


Lawson

AAAI Conferences

Touch can be a powerful means of communication, especially when it is combined with other sensing modalities, such as speech. The challenge on a humanoid robot is to sense touch in a way that is sensitive to subtle cues, such as the hand used and the amount of force applied. We propose a novel combination of sensing modalities to extract touch information. We extract hand information using the Leap Motion active sensor, then determine force information from force sensitive resistors. We combine these sensing modalities at the feature level, then train a support vector machine to recognize specific touch gestures. We demonstrate a high level of accuracy recognizing four different touch gestures from the firefighting domain.
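
The feature-level fusion step can be illustrated with a short sketch: concatenate hand features and force readings into one vector per sample and train an SVM. The feature dimensions and data here are placeholders, not the study's pipeline.

```python
# Minimal sketch (illustrative, not the paper's pipeline): feature-level fusion
# of hand features (e.g. from a Leap Motion sensor) with force-sensitive
# resistor readings, followed by an SVM classifier on placeholder data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples = 200
hand_features = rng.normal(size=(n_samples, 12))   # placeholder hand features
force_features = rng.normal(size=(n_samples, 4))   # placeholder FSR features
labels = rng.integers(0, 4, size=n_samples)        # four touch-gesture classes

X = np.hstack([hand_features, force_features])     # fuse at the feature level
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, labels)
print(clf.predict(X[:5]))
```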


Narber

AAAI Conferences

Nonverbal communication is a critical way for humans to relay information and can take many forms, including hand gestures, touch, and facial expressions. Our work focuses on touch gestures. In typical systems, recognition does not begin until the touch has completed, and the robot then still needs time to plan an appropriate response, so its reaction is delayed. We have trained an artificial neural network on features extracted from the Leap Motion Controller, and successfully performed early recognition of touch gestures with high accuracy.
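
One way to approximate early recognition, sketched below with placeholder features rather than the actual Leap Motion pipeline, is to train the classifier only on the first portion of each gesture track so a prediction is available before the touch finishes.

```python
# Minimal sketch (illustrative, not the paper's system): early recognition by
# training a small neural network on features from only the first part of each
# gesture. Data and feature extraction are placeholders.
import numpy as np
from sklearn.neural_network import MLPClassifier

def prefix_features(sequence: np.ndarray, fraction: float = 0.5) -> np.ndarray:
    """Summarise only the first `fraction` of a (time, dims) gesture track."""
    cut = max(1, int(len(sequence) * fraction))
    prefix = sequence[:cut]
    return np.concatenate([prefix.mean(axis=0), prefix.std(axis=0)])

rng = np.random.default_rng(1)
sequences = [rng.normal(size=(60, 6)) for _ in range(100)]   # placeholder tracks
labels = rng.integers(0, 3, size=100)                        # placeholder classes

X = np.stack([prefix_features(s) for s in sequences])
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(X, labels)
print(clf.predict(X[:3]))   # predictions from half-completed gestures
```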


JBL Live 300TWS review: Feature-rich earbuds for the not-so-rich

PCWorld

Considering the JBL Live 300TWS's price and relatively simple design, I expected an equally simple experience--but I shouldn't have judged a pair of true wireless earbuds on its subdued looks. A couple of weeks on, the Live 300TWS's features still dazzle me, and the music quality is far better than I expected from buds in this price range. The 2.7-inch-wide case alone looks as if it could come from a premium set of earbuds. Smooth, compact, and pocketable, its notification lights have a chic air reminiscent of luxury goods: A ring around the USB-C port along the bottom pulses white while the earbuds charge inside--the better to see where to stick the plug in low light--and it shifts to red once the cable is in. The act of opening the case could be more elegant, though; eventually, I fumbled far less with the lid after learning to pull it up from the sides.


iOS 10: bigger emojis, better Siri and facial recognition coming to your iPhone

The Guardian

The next version of Apple's software for the iPhone and iPad, iOS 10, will feature enhanced 3D Touch features, expanded Siri and an improved lock screen plus overhauls to Photos, Music and Messages. Apple has also improved notifications, allowing apps to provide rich notifications that are activated via 3D Touch gestures in the notification pane, as well as the widget pane and via the lock screen. Siri is now open to third-party developers, which means apps like WeChat can be accessed straight from the voice-control window. The Apple QuickType keyboard now has part of Siri's machine learning, allowing it to predict your responses based on what is happening on the rest of the phone. Apple Photos now has facial, scene and object recognition built in, which is performed on device, as well as the ability to view your photos on a map and to automatically group photos for topics, trips, people and other activities in what Apple calls "Memories".